Comparison of VTE risk scores in guidelines for VTE diagnosis in nonsurgical hospitalized patients with suspected VTE

Background The assessment of VTE likelihood with VTE risk scores is essential prior to imaging examinations during the VTE diagnostic procedure. Little is known about how the predictive power for VTE diagnosis differs among the VTE risk scores recommended in guidelines for nonsurgical hospitalized patients with clinically suspected VTE.
Methods A retrospective study was performed to compare the predictive power for VTE diagnosis among the Wells, Geneva, YEARS, PERC, Padua, and IMPROVE scores, all of which appear in the leading authoritative guidelines, in nonsurgical hospitalized patients with suspected VTE.
Results Among 3168 nonsurgical hospitalized patients with suspected VTE, VTE was finally excluded in 2733 (86.3%) and confirmed in 435 (13.7%). The sensitivity and specificity of the Wells, Geneva, YEARS, PERC, Padua, and IMPROVE scores were (90.3%, 49.8%), (88.7%, 53.6%), (73.8%, 50.2%), (97.7%, 16.9%), (80.9%, 44.0%), and (78.2%, 47.0%), respectively. The Youden index (YI) was 0.401, 0.423, 0.240, 0.146, 0.249, and 0.252 for the Wells, Geneva, YEARS, PERC, Padua, and IMPROVE scores, respectively. The C-index was 0.694 (0.626–0.762), 0.697 (0.623–0.772), 0.602 (0.535–0.669), 0.569 (0.486–0.652), 0.607 (0.533–0.681), and 0.609 (0.538–0.680) for the Wells, Geneva, YEARS, PERC, Padua, and IMPROVE scores, respectively. Consistency was significant in the pairwise comparisons of Wells vs Geneva (Kappa 0.753, P = 0.565), YEARS vs Padua (Kappa 0.816, P = 0.565), YEARS vs IMPROVE (Kappa 0.771, P = 0.645), and Padua vs IMPROVE (Kappa 0.789, P = 0.812), whereas it was not present in the other pairs. The YI improved to 0.304, 0.272, and 0.264 for the PERC (AUC 0.631 [0.547–0.714], P = 0.006), Padua (AUC 0.613 [0.527–0.700], P = 0.017), and IMPROVE (AUC 0.614 [0.530–0.698], P = 0.016) scores, with revised cutoffs of 5 or less, 6 or more, and 4 or more denoting VTE-likely, respectively.
Conclusions For nonsurgical hospitalized patients with suspected VTE, the Geneva and Wells scores perform best, the PERC score performs worst despite its markedly high sensitivity, and the others perform intermediately, although the absolute predictive power of every isolated score is mediocre. The predictive power of the PERC, Padua, and IMPROVE scores is improved with revised cutoffs.

Strong risk factors for VTE occurrence include, but are not limited to: active cancer, previous VTE, antiphospholipid syndrome, recent hospitalization for acute illness (especially myocardial infarction, heart failure, or atrial fibrillation/flutter), recent major trauma, fracture, or surgery (especially hip or knee replacement), prolonged immobility > 3 days, and heparin-induced thrombocytopenia. Clinical assessment of predisposing risk factors and presenting symptoms of VTE allows patients with suspected VTE to be stratified into distinct categories that correspond to an actual prevalence of confirmed VTE, and is necessary to estimate a patient's risk of VTE before any further investigation. Pretest probability (PTP) assessment of VTE is the first and key step of the whole diagnostic algorithm for VTE, since the post-test probability of VTE, and hence the interpretation of imaging results, depends not only on the results themselves but also on the pretest probability of VTE. Risk assessment of VTE can be performed either by empirical clinical gestalt or by standardized models [2,6]. Although the value of empirical clinical gestalt has been confirmed in several large studies [7,8], standardized VTE risk assessment using clinical models, scores, or rules is preferred over gestalt, since gestalt lacks standardization and cannot be taught as a standard operating procedure [2,6,9,10].
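The dependence of the post-test probability on the pretest probability can be made concrete with the odds form of Bayes' theorem. The following is a minimal sketch only; the function name `post_test_probability` and the example inputs (a positive result with a likelihood ratio of 5) are illustrative assumptions and are not taken from the study.

```python
def post_test_probability(pretest_probability: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability into a post-test probability.

    Odds form of Bayes' theorem:
    post-test odds = pretest odds * likelihood ratio.
    """
    if not 0.0 < pretest_probability < 1.0:
        raise ValueError("pretest_probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest_probability / (1.0 - pretest_probability)
    post_test_odds = pretest_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical example: the same positive test result (LR+ = 5) applied to a
# low and a high pretest probability.
low = post_test_probability(0.05, 5.0)
high = post_test_probability(0.50, 5.0)
```

With an identical test result, the patient with a 5% pretest probability ends at roughly one in five, whereas the patient with a 50% pretest probability ends at five in six, which is why the same imaging finding cannot be interpreted without a PTP estimate.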
Accordingly, a series of VTE (including PE) risk scores, represented by the Wells score [11,12] and the revised Geneva score [13,14], have emerged one after another. An independent VTE risk score usually consists of VTE risk factors, the points assigned to each risk factor, and defined cutoffs for risk classification. To date, the VTE risk scores approved by leading authoritative guidelines, such as those of the European Society of Cardiology (ESC)/European Respiratory Society (ERS), the American College of Chest Physicians (ACCP), and the American Society of Hematology (ASH), for patients with suspected VTE mainly include the Wells [2,9,10,15,16], the revised Geneva [2,9,10,15], the YEARS [2], the PERC [2], the Padua [17,18], the IMPROVE [18], the Caprini [19], and the Rogers [19] scores. Since the latter two are targeted entirely at surgical patients, their value for VTE risk assessment in the nonsurgical patient population is limited. In addition, although there is a Geneva VTE risk assessment model (RAM) [20,21], it has not yet been endorsed by the primary authoritative guidelines.
To the best of our knowledge, no study to date has compared the predictive power for VTE diagnosis among all the VTE risk scores approved by the leading authoritative guidelines for nonsurgical hospitalized patients with suspected VTE. Faced with such a variety of VTE risk scores, clinicians may be uncertain which one to choose in daily clinical practice. Accordingly, the present study was carried out to address this issue.

Study design
A retrospective study was performed to compare the predictive power for VTE diagnosis among six VTE risk scores, namely the Wells, Geneva, YEARS, PERC, Padua, and IMPROVE RAMs, which are approved by the leading authoritative guidelines for nonsurgical hospitalized patients with suspected VTE. Nonsurgical hospitalized patients were reviewed if they had undergone diagnostic imaging for VTE, including computed tomography pulmonary angiography (CTPA), compression ultrasonography (CUS) of the lower extremities, and/or planar ventilation/perfusion (V/Q) scan, because of a suspicion of VTE triggered by typical symptoms or signs of PE and/or DVT, and/or by a D-dimer level of 500 ng/mL or more. Clinical suspicion of VTE was raised by the patients' attending physicians at hospital admission. VTE was defined as PE and/or DVT. Nonsurgical patients were defined as patients who were not in a perioperative period. All eligible patients were classified into VTE and non-VTE groups according to the results of their VTE imaging examinations. In the present study, the PTP of VTE in each patient was reassessed with the Wells, Geneva, YEARS, PERC, Padua, and IMPROVE scores. The result of the VTE likelihood assessment by each score in each patient was defined as either VTE-unlikely or VTE-likely. The VTE-unlikely or VTE-likely classification from each score was then contrasted with the actual absence or presence of VTE in all patients, thereby comparing the predictive power of the scores for VTE diagnosis. Pairwise comparisons of diagnostic consistency and dominance were conducted between every two RAMs. The predictive power for VTE diagnosis was also reanalyzed without the originally defined cutoffs of the scores, to explore whether their performance would improve with a revised cutoff and thereby to assess the appropriateness of the original cutoffs.
The parameters recorded at the time of hospital admission were collected as the variables of the RAMs in the present study.
With respect to the Wells score, the simplified version was adopted in the current study because it is more widely used in clinical practice than the original one [22], although the term "Wells" is still used for the rest of this article. It consists of previous PE or DVT (1 point), heart rate > 100 beats per minute (bpm) (1 point), surgery or immobilization within the past 4 weeks (1 point), hemoptysis (1 point), active cancer (1 point), clinical signs of DVT (1 point), and alternative diagnosis less likely than PE (1 point). A total score of 1 or less denotes VTE-unlikely, whereas 2 or more denotes VTE-likely [2]. Likewise, the simplified revised version was employed for the Geneva score for the same reason [23], although the term "Geneva" is still used for the rest of this article. It contains previous PE or DVT (1 point), heart rate 75–94 bpm (1 point), heart rate ≥ 95 bpm (2 points), surgery or fracture within the past month (1 point), hemoptysis (1 point), active cancer (1 point), unilateral lower-limb pain (1 point), pain on lower-limb deep venous palpation and unilateral oedema (1 point), and age > 65 years (1 point). A total score of 2 or less denotes VTE-unlikely, whereas 3 or more denotes VTE-likely [2]. The YEARS score consists of clinical signs of DVT (1 point), hemoptysis (1 point), and PE as the most likely diagnosis (1 point). A total score of 0 denotes VTE-unlikely, whereas 1 or more denotes VTE-likely [2,6,24]. The PERC rule comprises age < 50 years (1 point), pulse < 100 bpm (1 point), oxygen saturation (SaO2) > 94% (1 point), no unilateral leg swelling (1 point), no hemoptysis (1 point), no recent trauma or surgery (1 point), no history of VTE (1 point), and no oral hormone use (1 point). A total score of 8 denotes VTE-unlikely, whereas 7 or less denotes VTE-likely [2,25].
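The dichotomization described above can be sketched in a few lines of Python, shown here for the simplified Wells score and the PERC rule because their cutoffs point in opposite directions. This is a minimal illustration under stated assumptions: the item keys (for example `heart_rate_over_100`) and the helper names are hypothetical labels chosen for readability, and only the item lists and cutoffs come from the text.

```python
from typing import Iterable, Mapping

# Hypothetical item keys; each present finding counts 1 point.
WELLS_ITEMS = (
    "previous_pe_or_dvt",
    "heart_rate_over_100",
    "surgery_or_immobilization_within_4_weeks",
    "hemoptysis",
    "active_cancer",
    "clinical_signs_of_dvt",
    "alternative_diagnosis_less_likely_than_pe",
)

# PERC items are "reassuring" criteria; each criterion met counts 1 point.
PERC_ITEMS = (
    "age_under_50",
    "pulse_under_100",
    "sao2_over_94",
    "no_unilateral_leg_swelling",
    "no_hemoptysis",
    "no_recent_trauma_or_surgery",
    "no_history_of_vte",
    "no_oral_hormone_use",
)


def count_points(findings: Mapping[str, bool], items: Iterable[str]) -> int:
    """Count the items recorded as present; missing keys count as absent."""
    return sum(1 for name in items if findings.get(name, False))


def wells_category(findings: Mapping[str, bool]) -> str:
    """Simplified Wells: 1 or less is VTE-unlikely, 2 or more is VTE-likely."""
    return "VTE-likely" if count_points(findings, WELLS_ITEMS) >= 2 else "VTE-unlikely"


def perc_category(findings: Mapping[str, bool]) -> str:
    """PERC: only a full score of 8 is VTE-unlikely; 7 or less is VTE-likely."""
    return "VTE-unlikely" if count_points(findings, PERC_ITEMS) == 8 else "VTE-likely"
```

Note that a higher Wells total moves a patient toward VTE-likely, whereas a higher PERC total moves a patient toward VTE-unlikely, so a single missing PERC criterion is enough to classify the patient as VTE-likely.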
The Padua score contains reduced mobility (3 points), active cancer (3 points), previous VTE excluding superficial thrombophlebitis (3 points), known thrombophilic condition (3 points), recent trauma and/or surgery within the past month (2 points), age > 70 years (1 point), heart and/or respiratory failure (1 point), acute myocardial infarction or ischemic stroke (1 point), ongoing hormonal treatment (1 point), body mass index > 30 (1 point), and acute infection and/or rheumatologic disorder (1 point). A total score of 3 or less denotes VTE-unlikely, whereas 4 or more denotes VTE-likely [17,18]. The IMPROVE score consists of previous VTE (3 points), known thrombophilia (2 points), lower limb paralysis (2 points), active cancer (2 points), immobilization ≥ 7 days (1 point), intensive care unit (ICU)/coronary care unit (CCU) stay (1 point), and age > 60 years (1 point). A total score of 1 or less denotes VTE-unlikely, whereas 2 or more denotes VTE-likely [18]. The characteristics of all six scores are summarized in Table 1. The presence frequency of the VTE risk elements across all six scores, in descending order, is shown in Fig. 1.
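Unlike the one-point-per-item scores, the Padua and IMPROVE scores weight their items. A minimal Python sketch of the weighted sum and dichotomization follows; the dictionary keys and function names are hypothetical labels, while the weights and the original cutoffs (4 or more for Padua, 2 or more for IMPROVE) are those listed above. The cutoff is exposed as a parameter so that an alternative threshold can be tried.

```python
from typing import Mapping

# Item weights as listed in the text; key names are illustrative only.
PADUA_WEIGHTS = {
    "reduced_mobility": 3,
    "active_cancer": 3,
    "previous_vte": 3,
    "known_thrombophilic_condition": 3,
    "recent_trauma_or_surgery": 2,
    "age_over_70": 1,
    "heart_or_respiratory_failure": 1,
    "acute_mi_or_ischemic_stroke": 1,
    "ongoing_hormonal_treatment": 1,
    "bmi_over_30": 1,
    "acute_infection_or_rheumatologic_disorder": 1,
}

IMPROVE_WEIGHTS = {
    "previous_vte": 3,
    "known_thrombophilia": 2,
    "lower_limb_paralysis": 2,
    "active_cancer": 2,
    "immobilization_7_days_or_more": 1,
    "icu_or_ccu_stay": 1,
    "age_over_60": 1,
}


def weighted_score(findings: Mapping[str, bool], weights: Mapping[str, int]) -> int:
    """Sum the weights of all items recorded as present."""
    return sum(points for name, points in weights.items() if findings.get(name, False))


def categorize(score: int, likely_from: int) -> str:
    """Scores at or above `likely_from` are VTE-likely, otherwise VTE-unlikely."""
    return "VTE-likely" if score >= likely_from else "VTE-unlikely"


def padua_category(findings: Mapping[str, bool], likely_from: int = 4) -> str:
    return categorize(weighted_score(findings, PADUA_WEIGHTS), likely_from)


def improve_category(findings: Mapping[str, bool], likely_from: int = 2) -> str:
    return categorize(weighted_score(findings, IMPROVE_WEIGHTS), likely_from)
```

Passing a different `likely_from` value reclassifies the same patient under another threshold without changing the item weights.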
The present study was conducted by the investigators of Shanghai Xinhua Hospital, Shanghai Pulmonary Hospital, and Shanghai Punan Hospital. Relevant data were retrieved from the electronic medical record systems of each participating hospital. All authors vouched for the completeness and accuracy of the data. No one who is not an author contributed to the manuscript writing. The study protocol was approved by the institutional review board of each participating hospital.

Study population
Eligible patients from the participating hospitals were incorporated into the present study as per the inclusion and exclusion criteria. The inclusion criteria were as follows: 1) all eligible patients were 18 years old or older; 2) all eligible patients underwent diagnostic imaging for VTE, including CTPA, CUS of the lower extremities, and/or planar V/Q scan, to confirm the absence or presence of VTE during the hospitalization; 3) all eligible patients were nonsurgical hospitalized patients. The exclusion criteria were as follows: 1) patients were excluded if they had a known history of chronic thromboembolic disease (CTED) [2] or were diagnosed with CTED during the hospitalization; 2) patients were excluded if they were finally diagnosed during the hospitalization with non-thrombotic venous embolism, primarily including tumor embolism, amniotic fluid embolism, fat embolism, septic embolism, and angiosarcoma.

Statistical analyses
Measurement data were compared between groups by using the t-test. Rates were compared by using the Chi-square test. The numbers of true positive (TP), false positive (FP), false negative (FN), and true negative (TN) patients resulting from each score, as well as sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), false positive rate (FPR) (misdiagnosis rate), false negative rate (FNR) (omission diagnostic rate), positive likelihood ratio (PLR), negative likelihood ratio (NLR), diagnostic odds ratio (DOR), number needed to diagnose (NND), success rate (SR) (crude agreement), failure rate (FR), adjusted agreement (AA), Youden index (YI), and Harrell's concordance index (C-index), were compared among the Wells, Geneva, YEARS, PERC, Padua, and IMPROVE scores. Pairwise comparisons of diagnostic consistency and dominance between every two scores were performed by using Cohen's Kappa coefficient analysis and McNemar's test, respectively. Logistic regression analysis was applied to explore the correlation between VTE occurrence and the VTE risk elements in the scores. Receiver operating characteristic (ROC) curve analysis was performed to reveal and compare the predictive power for VTE diagnosis among all VTE scores without using the originally defined cutoffs. Statistical analyses were performed by using SPSS.
Table 1 Characteristics of all six VTE scores
VTE Venous thromboembolism, PE Pulmonary embolism, DVT Deep venous thrombosis, ICU/CCU Intensive care unit/coronary care unit. "+" denotes the presence of the VTE risk elements, "-" denotes the absence of the VTE risk elements

Correlation between VTE risk elements in scores and VTE occurrence
Univariate and subsequent multivariate logistic regression analyses demonstrated that, in the present study population, most VTE risk elements in the six scores were correlated with VTE occurrence, except for acute infection and/or rheumatologic disorder, body mass index, ongoing hormonal therapy, and oxygen saturation. The correlation between the VTE risk elements in the scores and VTE occurrence is presented in Table 3.

Comparison of predictive power for VTE diagnosis among all scores
The VTE-unlikely or VTE-likely classification reassessed by each score was compared with the actual absence or presence of VTE, respectively. The VTE diagnostic prevalence was 55.7%, 52.2%, 53.1%, 85.1%, 59.4%, and 56.4%, whereas the VTE exclusion prevalence was 44.3%, 47.8%, [...] Table 4.
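For reference, the accuracy measures compared here all derive from the four cells of a 2 × 2 table. The sketch below is illustrative only: the function name `diagnostic_metrics` is hypothetical, and the example counts are back-calculated from the rounded Wells sensitivity and specificity reported in the abstract (435 VTE and 2733 non-VTE patients), so they are assumptions rather than values taken from Table 4. The diagnostic and exclusion prevalence are interpreted here as the fractions of patients classified VTE-likely and VTE-unlikely, respectively.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive common accuracy measures from a 2 x 2 table.

    Assumes every cell is non-zero so that no ratio divides by zero.
    """
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "plr": sensitivity / (1.0 - specificity),
        "nlr": (1.0 - sensitivity) / specificity,
        "dor": (tp * tn) / (fp * fn),
        "youden_index": sensitivity + specificity - 1.0,
        # Fraction of all patients classified VTE-likely / VTE-unlikely.
        "likely_fraction": (tp + fp) / total,
        "unlikely_fraction": (tn + fn) / total,
    }


# Assumed counts, reconstructed from rounded percentages (not from Table 4).
wells_like = diagnostic_metrics(tp=393, fp=1372, fn=42, tn=1361)
```

Because the VTE-likely and VTE-unlikely fractions are complements, each pair of diagnostic and exclusion prevalence values sums to 100%.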

Pairwise comparison of diagnostic consistency and dominance between every two scores
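The pairwise comparison of diagnostic consistency between every two scores showed that excellent consistency was present in the pairs of Wells vs Geneva (Kappa 0.753), YEARS vs Padua (Kappa 0.816), YEARS vs IMPROVE (Kappa 0.771), and Padua vs IMPROVE (Kappa 0.789), whereas it did not appear in the other pairs.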
Likewise, the pairwise comparison of diagnostic dominance between every two scores suggested that dominance differences were present between the two scores of every remaining pair, that is, all pairs except the aforementioned ones. The pairwise comparisons of diagnostic consistency and dominance between every two scores are presented in Table 5.
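The two pairwise statistics can be sketched as follows for a pair of scores applied to the same patients. This is a minimal illustration with hypothetical function names and toy data; it uses the uncorrected asymptotic form of McNemar's test, which may differ from the exact or continuity-corrected variant produced by the statistical package used in the study.

```python
import math
from typing import Sequence, Tuple


def paired_table(calls_a: Sequence[bool], calls_b: Sequence[bool]) -> Tuple[int, int, int, int]:
    """Cross-tabulate two scores' VTE-likely calls (True) on the same patients.

    Returns (both likely, only A likely, only B likely, both unlikely).
    """
    if len(calls_a) != len(calls_b):
        raise ValueError("both scores must be applied to the same patients")
    both = sum(1 for a, b in zip(calls_a, calls_b) if a and b)
    only_a = sum(1 for a, b in zip(calls_a, calls_b) if a and not b)
    only_b = sum(1 for a, b in zip(calls_a, calls_b) if b and not a)
    neither = len(calls_a) - both - only_a - only_b
    return both, only_a, only_b, neither


def cohen_kappa(both: int, only_a: int, only_b: int, neither: int) -> float:
    """Chance-corrected agreement between two dichotomous classifications."""
    n = both + only_a + only_b + neither
    observed = (both + neither) / n
    expected = ((both + only_a) * (both + only_b)
                + (only_b + neither) * (only_a + neither)) / (n * n)
    return (observed - expected) / (1.0 - expected)


def mcnemar_p_value(only_a: int, only_b: int) -> float:
    """Uncorrected McNemar chi-square test (1 df) on the discordant pairs."""
    discordant = only_a + only_b
    if discordant == 0:
        return 1.0
    statistic = (only_a - only_b) ** 2 / discordant
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(statistic / 2.0))


# Toy data: 100 hypothetical patients classified by two scores.
calls_a = [True] * 45 + [False] * 55
calls_b = [True] * 40 + [False] * 5 + [True] * 5 + [False] * 50
table = paired_table(calls_a, calls_b)
kappa = cohen_kappa(*table)
p_value = mcnemar_p_value(table[1], table[2])
```

In this reading, a high Kappa together with a non-significant McNemar P value indicates that two scores classify patients consistently and that neither dominates the other.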

ROC analyses of predictive power for VTE diagnosis of all scores without fixed cutoffs
By using ROC curve analyses, the predictive power of all scores for VTE diagnosis was reanalyzed in the present study population without applying the original fixed cutoff of each RAM. The results indicated that the original cutoffs, sensitivity, specificity, and YI remained the same as those in Table 4

Discussion
The results of the present study revealed that, in descending order of YI and C-index, the predictive power for VTE diagnosis ranked as follows: the Geneva, Wells, IMPROVE, Padua, YEARS, and PERC scores. No statistical difference in predictive power for VTE diagnosis was found in the pairs of Wells vs Geneva, YEARS vs Padua, YEARS vs IMPROVE, and Padua vs IMPROVE, whereas a difference was present in the other pairs. In other words, the Geneva and Wells performed best, the PERC performed worst, and the others performed intermediately. Revised cutoffs improved the predictive power for VTE diagnosis of the PERC, Padua, and IMPROVE scores. Of note, the absolute predictive performance of all these isolated scores was poor. The prevalence of VTE in the current cohort was 13.7%, which is broadly consistent with previous studies, in which overall VTE event rates in hospitalized medical patients ranged from 10 to 15% [26]. Accordingly, the degree of VTE risk in this study population is representative of nonsurgical hospitalized patients. Most of the items in these scores were correlated with VTE occurrence in the current patient population, which once again supports their inclusion in those scores. Comparisons involving more than two of these six scores have been rare to date. No identical previous study is available for reference, although some analogous studies exist. A recent systematic review compared the capacity to rule out PE among the Wells, Geneva, YEARS, and PERC scores across different healthcare settings. In the hospitalized healthcare setting, the Wells plus PTP-adjusted D-dimer (sensitivity 95.64%, specificity 39.50%), the Geneva plus PTP-adjusted D-dimer (sensitivity 95.73%, specificity 37.29%), and the YEARS plus PTP-adjusted D-dimer (sensitivity 96.94%, specificity 35.83%) yielded similar diagnostic accuracy [27].
This was broadly consistent with the present results, except that the YEARS was inferior to the other two in the current study. Since the aforementioned systematic review incorporated PTP-adjusted D-dimer, especially for the YEARS, and targeted only PE, it cannot be regarded as a fully comparable reference.
Among these six scores, the comparisons of Wells versus Geneva and of Padua versus IMPROVE have been performed most frequently. The results of the comparison between Wells and Geneva were mixed among the related studies. Some previous studies supported the view that the Wells and Geneva scores had similar prediction accuracy for patients with suspected PE [28–32], whereas others favored the view that the Wells score was more accurate than the Geneva score [33–38]. In the present study, the predictive power for VTE diagnosis was similar between the Wells and Geneva scores, although the Geneva score seemed slightly better than the Wells score without statistical difference. With respect to Padua versus IMPROVE, several previous studies comparing them suggested that the predictive power for VTE diagnosis was evenly matched between the two scores [39–41]. The results of the present study were consistent with those of previous studies, although the IMPROVE seemed slightly better than the Padua without statistical significance.
The correlation between VTE occurrence and the predisposing factors or typical indications of VTE contained in the VTE risk scores affects their predictive power for VTE diagnosis. The stronger the correlation, the better the predictive power for VTE diagnosis. According to Table 1, the sums of the presence frequency of VTE risk elements, in descending order, are 30, 29, 28, 26, 26, and 11 times for the Geneva, PERC, Wells, Padua, IMPROVE, and YEARS, respectively. The presence frequencies per element, in descending order, are 4.29, 4.00, 3.71, 3.67, 3.63, and 2.60 for the Geneva, Wells, IMPROVE, YEARS, PERC, and Padua, respectively. The sum of the presence frequency of VTE risk elements and, especially, the presence frequency per element in authoritative VTE scores can reflect the relevance and degree of acceptance of these risk elements in VTE risk assessment. According to Fig. 1, the VTE risk elements that appear three or more times are recent immobilization, trauma, or surgery (5 times), previous VTE history (5 times), DVT symptoms and/or signs (5 times), hemoptysis (4 times), active cancer (4 times), age (4 times), and heart rate or pulse (3 times).
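The per-element figures above are simply each score's presence-frequency sum divided by its number of elements. The short Python sketch below reproduces that arithmetic. The sums are those quoted in the text; the element counts are assumptions, stated in the text for the Wells (seven) and YEARS (three) and otherwise back-calculated from the reported ratios, and the variable names are hypothetical.

```python
# Presence-frequency sums quoted in the text (from Table 1).
FREQUENCY_SUM = {
    "Geneva": 30, "PERC": 29, "Wells": 28,
    "Padua": 26, "IMPROVE": 26, "YEARS": 11,
}

# Assumed number of distinct elements per score (partly back-calculated).
ELEMENT_COUNT = {
    "Geneva": 7, "PERC": 8, "Wells": 7,
    "Padua": 10, "IMPROVE": 7, "YEARS": 3,
}

# Presence frequency per element for each score.
per_element = {
    name: FREQUENCY_SUM[name] / ELEMENT_COUNT[name] for name in FREQUENCY_SUM
}

# Scores ordered from the highest to the lowest per-element frequency.
ranking = sorted(per_element, key=per_element.get, reverse=True)
```

A score with few but widely shared elements, such as the YEARS, can therefore rank high on the per-element measure despite a small sum.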
The choice of cutoffs for risk classification in VTE scores also has an impact on their predictive power for VTE diagnosis. For most VTE scores, the higher the cutoff, the higher the specificity and the lower the sensitivity, and vice versa. The more appropriate the cutoff, the better the predictive power for VTE diagnosis. A balance must be struck between missed diagnoses and excessive examinations. Of note, patient populations with different clinical VTE probabilities may require different cutoffs. The ratios of the VTE-likely cutoff to the total points, in descending order, are 0.88, 0.33, 0.33, 0.29, 0.20, and 0.17 for the PERC, Geneva, YEARS, Wells, Padua, and IMPROVE RAMs, respectively. The PERC score is distinctive among the six scores in that all of its items are negative risk factors for VTE occurrence, whereas the other five scores all contain positive ones. Had its items been framed as positive risk factors for VTE occurrence, its ratio of the VTE-likely cutoff to the total points would have been 0.12 instead of 0.88, which is actually the lowest among the six scores.
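The trade-off between sensitivity and specificity across cutoffs, and the selection of the cutoff with the highest Youden index, can be sketched as below. This is an illustration under stated assumptions rather than the analysis performed in the study: the function names are hypothetical, the data are toy values, and a patient is taken to be VTE-likely when the score is at or above the cutoff (for a rule oriented like the PERC, the scores would first have to be reversed).

```python
from typing import List, Sequence, Tuple


def cutoff_table(scores: Sequence[int], has_vte: Sequence[bool]) -> List[Tuple[int, float, float, float]]:
    """For each observed score value used as a cutoff, return
    (cutoff, sensitivity, specificity, Youden index)."""
    positives = [s for s, y in zip(scores, has_vte) if y]
    negatives = [s for s, y in zip(scores, has_vte) if not y]
    if not positives or not negatives:
        raise ValueError("both VTE and non-VTE patients are required")
    rows = []
    for cutoff in sorted(set(scores)):
        sensitivity = sum(1 for s in positives if s >= cutoff) / len(positives)
        specificity = sum(1 for s in negatives if s < cutoff) / len(negatives)
        rows.append((cutoff, sensitivity, specificity, sensitivity + specificity - 1.0))
    return rows


def best_cutoff(scores: Sequence[int], has_vte: Sequence[bool]) -> Tuple[int, float, float, float]:
    """Return the row with the highest Youden index (first one on ties)."""
    return max(cutoff_table(scores, has_vte), key=lambda row: row[3])


# Toy data: ten hypothetical patients.
toy_scores = [0, 1, 1, 2, 2, 2, 3, 4, 5, 6]
toy_outcomes = [False, False, False, False, False, True, True, True, True, True]
rows = cutoff_table(toy_scores, toy_outcomes)
chosen = best_cutoff(toy_scores, toy_outcomes)
```

In the toy data, raising the cutoff never increases sensitivity and never decreases specificity, and the chosen cutoff is the point at which their sum is largest.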
Ever since the Wells and Geneva scores emerged, their role in the PTP prediction of PE has been externally validated in a series of previous studies [2,6,9,27,37]. The Geneva and Wells have the highest (30 times) and third highest (28 times) presence frequency of VTE risk elements, as well as the highest (4.29) and second highest (4.00) presence frequency per element among the six scores, respectively. The Geneva and Wells scores both contain the elements of recent immobilization, trauma, or surgery (5 times), previous VTE history (5 times), DVT symptoms and/or signs (5 times), hemoptysis (4 times), active cancer (4 times), and heart rate or pulse (3 times); in addition, the Wells score has the element of alternative diagnosis less likely than PE (2 times), whereas the revised Geneva has that of age (4 times). In other words, the Geneva and Wells scores, especially the former, have the most widely acknowledged risk elements for VTE diagnosis among the six scores. Universally accepted VTE risk factors, which represent the predictors most highly correlated with VTE occurrence, may help to improve the predictive accuracy of a score for VTE diagnosis. Meanwhile, the Wells and Geneva scores are highly similar in composition: six (26 times) of their seven elements are identical. This may account for their similar predictive performance in VTE diagnosis. In addition, the ROC analyses supported the appropriateness of their cutoffs. Nevertheless, it should be noted that the Wells score, unlike the Geneva, incorporates a subjective criterion, "alternative diagnosis less likely than PE", which depends on the experience of clinicians and is difficult to standardize or teach.
The Padua and IMPROVE scores are two authoritative scores acknowledged by leading guidelines for medical patients, and have been sufficiently validated in previous external studies [17,18,21]. A closer look at the composition of the Padua and IMPROVE reveals that they have the same presence frequency of VTE risk elements (26 times), whereas the presence frequency per element of the IMPROVE (3.71) is higher than that of the Padua (2.60). Both scores contain the elements of previous VTE history (5 times), recent immobilization, trauma, or surgery (5 times), age (4 times), active cancer (4 times), and thrombophilia (2 times). They differ in that the Padua score incorporates the elements of ongoing hormonal therapy (2 times), acute infection and/or rheumatologic disorder (1 time), acute myocardial infarction and/or ischemic stroke (1 time), body mass index (1 time), and heart and/or respiratory failure (1 time), whereas the IMPROVE incorporates the elements of DVT symptoms and signs (5 times) and ICU/CCU stay (1 time). Taken together, the majority of elements (20 of the total 26 times), which are widely acknowledged risk factors for VTE occurrence, are identical between the Padua and IMPROVE. Their similar performance may be attributable to this structural similarity, although the IMPROVE seemed slightly better than the Padua without statistical significance.
Overall, the Geneva and Wells generally outperformed the IMPROVE and Padua with respect to the predictive power for VTE diagnosis. These four scores share only three VTE risk elements, namely previous VTE history (5 times), recent immobilization, trauma, or surgery (5 times), and active cancer (4 times), and have a large proportion of elements not in common. By comparison, the Geneva and Wells both contain modifiable elements, such as hemoptysis and heart rate or pulse, that reflect the patient's status at the point of care, whereas the IMPROVE and Padua do not incorporate these elements. The lack of such elements may weaken their predictive power for VTE diagnosis. Of note, although these four RAMs all reflect VTE risk, the IMPROVE and Padua were endorsed by guidelines on VTE prevention or thromboprophylaxis [17,18], whereas the Geneva and Wells were endorsed by guidelines on the diagnosis and management of PE [2,9,10,15]. The results of the present study confirmed that the IMPROVE and Padua were inferior to the Geneva and Wells with respect to predictive power for VTE diagnosis. Nonetheless, revised cutoffs could improve their performance to a certain degree.
The YEARS score is a condensed derivative of the Wells score. Generally, the YEARS algorithm denotes the application of the YEARS score in combination with a D-dimer level rather than the isolated score alone [2,24]. Of note, the YEARS in the current study was an isolated score rather than an algorithm, since the current study was intended to compare the isolated VTE risk scores without D-dimer. As such, the current results are not applicable to the YEARS algorithm. The YEARS score has only three elements, which are DVT symptoms and/or signs (5 times), hemoptysis (4 times), and alternative diagnosis less likely than PE (2 times). Its presence frequency per element is 3.67, which is lower only than those of the Geneva, Wells, and IMPROVE, even though its presence frequency sum of VTE risk elements is only 11. In a retrospective study that compared the predictive accuracy for PE occurrence between the YEARS algorithm (RAM + D-dimer) and the Wells score, the YEARS algorithm was more sensitive than the Wells score (97.44% vs 74.36%) but less specific (13.97% vs 33.94%). Besides, the YEARS algorithm yielded a better negative predictive value than the Wells score (98.0% vs 92.4%). Nevertheless, that study employed the YEARS algorithm rather than the isolated YEARS score [42]. Accordingly, it is not an ideal parallel to the current study. In the present study, the isolated YEARS was outperformed by the Geneva and Wells, probably because of its excessively simple structure, although its diagnostic performance was similar to that of the IMPROVE and Padua. Nevertheless, its cutoff was shown to be appropriate. Of note, the YEARS also contains the subjective element "alternative diagnosis less likely than PE".
The PERC score was originally developed for the exclusion of PE among patients with a low clinical probability of PE and has been validated in a randomized controlled trial [43]. It has high sensitivity but low specificity for PE occurrence among patients with intermediate or high clinical probability of PE [2,44]. Likewise, its predictive power for VTE diagnosis was the worst among the six scores in the present study, in which the subjects were hospitalized patients who carried a considerable probability of VTE occurrence, although its NPV, FNR, NLR, and DOR were satisfactory. Among all these scores, the presence frequency sum of VTE risk elements in the PERC is 29 times, which is lower only than that in the Geneva (30 times), whereas its presence frequency per element is 3.63, which is the second lowest among all scores. More importantly, the original cutoff of the PERC, which was derived from a patient population with a low clinical probability of PE, resulted in its poor predictive power in the current patient population. With the original cutoff of the PERC, missed diagnoses were largely avoided at the expense of a substantial number of unnecessary imaging examinations, whereas a revised cutoff could improve its performance.
Several limitations of the current study need to be acknowledged. First, prospective studies are warranted since the current one was a retrospective review. Secondly, since the current subjects were nonsurgical hospitalized patients, the results may not be applicable to surgical patients and/or ambulatory outpatients. Besides, clinical VTE risk scores are generally intended for the evaluation of all nonsurgical hospitalized patients, whereas only nonsurgical hospitalized patients with suspected VTE were included in this study. Therefore, the results may not be applicable to general nonsurgical hospitalized patients. Thirdly, the Wells and Geneva scores adopted for the present study were the simplified versions rather than the original versions, and the results might have been different if the original versions had been employed. Likewise, the Wells DVT RAM [6,16] was not incorporated in the current study. Last but not least, D-dimer was not involved since the intention of the current study was to compare VTE risk scores. It is worth noting that the absolute performance of each isolated score per se was unsatisfactory (C-index < 0.7 for all), which is broadly consistent with the results of previous studies [45].
Accordingly, a combination of risk scores and D-dimer is highly recommended by guidelines at present [2]. The results might have been different if D-dimer had been involved.
In conclusion, the comparison of predictive power for VTE diagnosis among six guideline-endorsed VTE risk scores indicates that, in nonsurgical hospitalized patients with suspected VTE, the Geneva and Wells scores perform best, the PERC score performs worst, and the others perform intermediately. Little difference is present between the Geneva and Wells scores, or among the IMPROVE, Padua, and YEARS scores. Revised cutoffs improve the performance of the PERC, Padua, and IMPROVE scores. Nevertheless, the absolute performance of every isolated score is mediocre. The results may assist clinicians in selecting the relevant scores in the corresponding clinical settings.